[SPARK-21589][SQL][DOC] Add documents about Hive UDF/UDTF/UDAF#18792
maropu wants to merge 3 commits into apache:master
|
@gatorsmile If you get time, could you check this? Thanks! |
docs/sql-programming-guide.md
Outdated
| Some of them are meaningless in Spark and the others are rarely used by users. | ||
| Below is a list of major APIs we don't support in Spark SQL: | ||
|
|
||
| * `getRequiredJars` and `getRequiredFiles` (`UDF` and `GenericUDF`) are functions to automatically |
docs/sql-programming-guide.md
Outdated
| * `initialize(StructObjectInspector)` in `GenericUDTF` is not supported yet. Spark SQL currently uses | ||
| a deprecated interface `initialize(ObjectInspector[])` only. | ||
| * `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize | ||
| functions with `MapredContext`. But, Spark SQL does not use `MapredContext` internally. |
functions with `MapredContext`, which is inapplicable to Spark.
docs/sql-programming-guide.md
Outdated
| * `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize | ||
| functions with `MapredContext`. But, Spark SQL does not use `MapredContext` internally. | ||
| * `close` (`GenericUDF` and `GenericUDAFEvaluator`) is a function to release associated resources. | ||
| Spark SQL does not call this function when tasks finish. |
docs/sql-programming-guide.md
Outdated
| * `reset` (`GenericUDAFEvaluator`) is a function to re-initialize aggregation for reusing the same aggregation. | ||
| Spark SQL currently does not support the reuse of aggregation. | ||
| * `getWindowingEvaluator` (`GenericUDAFEvaluator`) is a function to optimize aggregation by evaluating | ||
| an aggregate over a fixed window. Spark SQL does not support this optimization yet. |
Please remove `Spark SQL does not support this optimization yet`.
|
Test build #80103 has finished for PR 18792 at commit
|
docs/sql-programming-guide.md
Outdated
|
|
||
| Spark SQL implements the basic functionality of the Hive UDF/UDTF/UDAF, but does not support all the APIs for users. | ||
| Some of them are meaningless in Spark and the others are rarely used by users. | ||
| Below is a list of major APIs we don't support in Spark SQL: |
How about simplifying the whole paragraph to the following?
Not all the APIs of the Hive UDF/UDTF/UDAF are supported by Spark SQL. Below are the unsupported APIs:
|
Thanks for working on it! Just left some minor comments. |
|
@gatorsmile ok, fixed. |
|
Test build #80106 has finished for PR 18792 at commit
|
|
Test build #80107 has finished for PR 18792 at commit
|
docs/sql-programming-guide.md
Outdated
| * `initialize(StructObjectInspector)` in `GenericUDTF` is not supported yet. Spark SQL currently uses | ||
| a deprecated interface `initialize(ObjectInspector[])` only. | ||
| * `configure` (`GenericUDF`, `GenericUDTF`, and `GenericUDAFEvaluator`) is a function to initialize | ||
| functions with `MapredContext`, which is inapplicable to Spark. But, Spark SQL does not use `MapredContext` internally. |
nit: `But` looks redundant here, because there's already `inapplicable` before it. It reads like a double negative...
|
LGTM pending Jenkins |
|
Test build #80109 has finished for PR 18792 at commit
|
What changes were proposed in this pull request?
This PR adds documentation about the unsupported functions in Hive UDF/UDTF/UDAF.
This PR relates to #18768 and #18527.
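To illustrate what the documented limitations could mean for a UDF author, here is a minimal, self-contained sketch. It does not use the real Hive classes: `SketchUdf`, `FragileUpper`, `RobustUpper`, and `runEvaluateOnly` are hypothetical stand-ins that only mimic the rough shape of a `configure` / `evaluate` / `close` lifecycle. It assumes, as the added text suggests, that a function cannot count on `configure` or `close` being invoked under Spark SQL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public final class LifecycleSketch {

    /** Hypothetical stand-in for a UDF lifecycle; NOT the real Hive GenericUDF API. */
    abstract static class SketchUdf {
        void configure(String context) {}        // stands in for configure(MapredContext)
        abstract String evaluate(String input);  // per-row evaluation
        void close() {}                          // stands in for close()
    }

    /** Fragile: all setup happens in configure(), so it breaks if that hook is skipped. */
    static final class FragileUpper extends SketchUdf {
        private Locale locale;                   // assigned only in configure()

        @Override
        void configure(String context) {
            locale = Locale.ROOT;
        }

        @Override
        String evaluate(String input) {
            // Without configure(), locale is still null and no result can be produced.
            return locale == null ? null : input.toUpperCase(locale);
        }
    }

    /** Robust: initializes lazily in evaluate() and holds nothing that needs close(). */
    static final class RobustUpper extends SketchUdf {
        private Locale locale;

        @Override
        String evaluate(String input) {
            if (locale == null) {
                locale = Locale.ROOT;            // lazy setup on first row
            }
            return input.toUpperCase(locale);
        }
    }

    /** Simulated engine (an assumption of this sketch) that calls evaluate() only. */
    static List<String> runEvaluateOnly(SketchUdf udf, List<String> rows) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            out.add(udf.evaluate(row));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b");
        System.out.println(runEvaluateOnly(new FragileUpper(), rows)); // prints [null, null]
        System.out.println(runEvaluateOnly(new RobustUpper(), rows));  // prints [A, B]
    }
}
```

The design point is only that setup done lazily inside the evaluation path keeps working when the surrounding lifecycle hooks are not called; how a real Hive function should be adapted depends on the actual Hive and Spark versions in use.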
How was this patch tested?
N/A